DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems
Authors
Abstract
We study the problem of distributed training of neural networks (NNs) on devices with heterogeneous, limited, and time-varying availability of computational resources. We present an adaptive, resource-aware, on-device learning mechanism, DISTREAL, which is able to fully and efficiently utilize the available resources on devices in a distributed manner, increasing the convergence speed. This is achieved with a dropout mechanism that dynamically adjusts the computational complexity of training an NN by randomly dropping filters of convolutional layers of the model. Our main contribution is the introduction of a design space exploration (DSE) technique, which finds Pareto-optimal per-layer dropout vectors with respect to resource requirements and convergence speed of the training. Applying this technique, each device is able to dynamically select the dropout vector that fits its available resources without requiring any assistance from the server. We implement our solution in a federated learning (FL) system, where the availability of computational resources varies both between devices and over time, and show through extensive evaluation that we are able to significantly increase the convergence speed over the state of the art without compromising on the final accuracy.
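The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' DSE implementation: the function names (`pareto_front`, `select_vector`, `sample_filter_mask`), the scalar cost and convergence scores, and the example numbers are all hypothetical, chosen only to show how a device might pick a Pareto-optimal per-layer dropout vector that fits its current budget and then randomly drop filters in one layer.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical candidate: (per-layer dropout vector, resource cost, convergence score).
Candidate = Tuple[Tuple[float, ...], float, float]


def pareto_front(candidates: Sequence[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates.

    A candidate is dominated if another one has cost <= and score >=,
    with at least one of the two strictly better.
    """
    front = []
    for vec, cost, score in candidates:
        dominated = any(
            (c2 <= cost and s2 >= score) and (c2 < cost or s2 > score)
            for _, c2, s2 in candidates
        )
        if not dominated:
            front.append((vec, cost, score))
    return sorted(front, key=lambda c: c[1])  # cheapest first


def select_vector(front: Sequence[Candidate], budget: float) -> Tuple[float, ...]:
    """Pick the best-scoring Pareto-optimal vector whose cost fits the budget.

    Assumption: if nothing fits, fall back to the cheapest vector.
    """
    feasible = [c for c in front if c[1] <= budget]
    if not feasible:
        return front[0][0]
    return max(feasible, key=lambda c: c[2])[0]


def sample_filter_mask(num_filters: int, drop_rate: float, rng: random.Random) -> List[bool]:
    """Randomly choose which filters of one convolutional layer stay active."""
    keep = max(1, round(num_filters * (1.0 - drop_rate)))
    kept = set(rng.sample(range(num_filters), keep))
    return [i in kept for i in range(num_filters)]


# Illustrative values only; a real system would measure cost and convergence.
candidates = [
    ((0.0, 0.0), 1.0, 1.0),
    ((0.5, 0.5), 0.4, 0.7),
    ((0.5, 0.0), 0.6, 0.6),  # dominated by (0.5, 0.5)
    ((0.8, 0.8), 0.1, 0.3),
]
front = pareto_front(candidates)
chosen = select_vector(front, budget=0.5)
mask = sample_filter_mask(num_filters=8, drop_rate=chosen[0], rng=random.Random(0))
```

Because the front is computed once and only the budget lookup runs at training time, a device in this sketch needs no server round trip to adapt, which mirrors the property the abstract claims.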
Similar resources
Multi-resource Aware Fairsharing for Heterogeneous Systems
Current production resource management and scheduling systems often use some mechanism to guarantee fair sharing of computational resources among different users of the system. For example, a user who has so far consumed a small amount of CPU time gets a higher priority, and vice versa. However, different users may have highly heterogeneous demands concerning system resources, including CPUs, RAM, HDD...
Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing
Today, almost everyone is connected to the Internet and uses different Cloud solutions to store, deliver and process data. Cloud computing assembles large networks of virtualized services such as hardware and software resources. The new era in which ICT penetrated almost all domains (healthcare, aged-care, social assistance, surveillance, education, etc.) creates the need for new multimedia conten...
Data-Aware Workflow Scheduling in Heterogeneous Distributed Systems
Data transfer in scientific workflows is attracting more attention because the large amounts of data generated by complex scientific workflows significantly increase the turnaround time of the whole workflow. It is almost impossible to produce an optimal or near-optimal schedule for the end-to-end workflow without considering the intermediate data movement. In order to reduce the...
Efficient DAG Scheduling with Resource-Aware Clustering for Heterogeneous Systems
Task scheduling on Heterogeneous Distributed Computing Systems (HeDCSs) with the purpose of efficiency and reduction of execution time is of paramount importance. In this paper, a novel task scheduling algorithm for Directed Acyclic Graphs (DAGs), called Resource-Aware Clustering (RAC), is proposed. The objective of this algorithm is to keep the relative load balancing and efficiency increase bet...
Robust Resource Allocation in Heterogeneous Parallel and Distributed Computing Systems
In parallel and distributed computing, multiple computers are collectively utilized to simultaneously process a set of tasks to improve performance over that of a single processor [BSB01]. Often, such computing systems are constructed from a heterogeneous mix of machines that may differ in their capabilities, e.g., available memory, number of floating point units, clock speed, and operating syst...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i7.20778